1 Introduction

After π, and then e, or perhaps the golden ratio φ, the Euler-Mascheroni
number γ stands among the most famous mathematical constants. We aim
here for a formulation of γ that makes it accessible to the widest possible
public.

Now the public knows π best by dint of its connection to computing
circular perimeters and areas. Though a mathematician would analyze these
computations by employing the apparatus of limits à la Cauchy, merely
communicating the meaning of π should not depend on Cauchy's sophisticated,
abstract, universal, rigorous formulation of limits. Cauchy's approach
synthesized diverse mathematical discourses, but it did not abolish them. The
public face of a mathematical constant should preferably not depend on
familiarity with Cauchy-style limits.

While a mathematical constant will possess a single, definite value, it may 
admit many interpretations according to the diverse contexts where it arises. 
For example, formulated properly, we may say that, with probability 6/π²,
two natural numbers chosen at random share no factor greater than 1. That π
occurs here despite the lack of any apparent connection to circles beautifully 
exemplifies the sort of excitement associated with pure mathematics! 

The public does not know e as well as it knows π, but e too admits
accessible narrative interpretations. If the public knew hyperbolas as well as
it knows circles, one could effectively characterize e as that number (greater
than 1) such that the area under y = 1/x over the interval [1, e] equals
1. Closer to practical concerns, one can observe that $1 left in the bank for a
year at 100% interest, compounded continuously, grows to $e. A seemingly
very different take on e involves derangements. Supposing that n people
participate in a Christmas party grab bag, we can ask for the probability
that no one gets their own gift back. All the probabilities with n even exceed
all the probabilities with n odd, and only the number 1/e lies in between.
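
The grab bag claim lends itself to a quick numerical check. The sketch below
assumes the standard inclusion-exclusion formula for the probability of a
derangement, which the text does not spell out; the function name is ours.

```python
from fractions import Fraction
from math import e, factorial


def derangement_probability(n: int) -> Fraction:
    """Probability that none of n people draws their own gift.

    Assumes the standard inclusion-exclusion formula
    sum_{k=0}^{n} (-1)^k / k!, which the text does not state.
    """
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))


probabilities = {n: float(derangement_probability(n)) for n in range(1, 13)}
largest_odd = max(p for n, p in probabilities.items() if n % 2 == 1)
smallest_even = min(p for n, p in probabilities.items() if n % 2 == 0)

# For n up to 12, the odd-n values stay below 1/e and the even-n values above it.
print(largest_odd, 1 / e, smallest_even)
```

Exact fractions keep the comparison honest for small n; only the final
comparison with 1/e uses floating point.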

The Euler-Mascheroni constant γ cries out for a canonical narrative
interpretation suitable for public consumption. Steven R. Finch's encyclopedic
Mathematical Constants lists several candidates, where the most compelling
takes the form

\[ \gamma = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \left( \left\lceil \frac{n}{k} \right\rceil - \frac{n}{k} \right), \]

a result of de la Vallée Poussin. As Finch paraphrases de la Vallée Poussin's
result:

... if a large integer n is divided by each integer 1 ≤ k ≤ n, then
the average fraction by which the quotient n/k falls short of the
next integer is not 1/2, but γ!

As an elementary interpretation of γ, de la Vallée Poussin's result has two
nice features not shared by Finch's other examples. First, γ occurs more
or less directly, rather than embedded in a more complicated formula. Second,
de la Vallée Poussin's formula for γ refers only to basic arithmetic and in
particular avoids mention of natural logarithms.
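
Since de la Vallée Poussin's recipe uses nothing beyond division and rounding
up, it is easy to try out. The following sketch averages the shortfall of n/k
from the next integer over k = 1, ..., n; the constant GAMMA is a reference
value we supply for comparison, and the function name is ours.

```python
GAMMA = 0.5772156649015329  # reference value of the Euler-Mascheroni constant (our addition)


def average_shortfall(n: int) -> float:
    """Average over k = 1..n of ceil(n/k) - n/k."""
    total = 0.0
    for k in range(1, n + 1):
        next_integer = -(-n // k)  # ceil(n / k) in exact integer arithmetic
        total += next_integer - n / k
    return total / n


for n in (100, 10_000, 100_000):
    print(n, average_shortfall(n), GAMMA)
```

When k divides n the quotient is already an integer and contributes a
shortfall of 0 here, which is how we read "the next integer" in the
paraphrase.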

We offer here a novel elementary interpretation (indeed a vast family
of such interpretations) of γ sharing the stated advantages of de la Vallée
Poussin's and the additional advantage, perhaps, that it arises very naturally
if one considers a very modest variation on a very familiar mathematical 
situation. 

We mean to address two sorts of readers at once, namely those who 
have had (or remember) only high school mathematics and would like to 
learn about the Euler-Mascheroni constant from scratch, and those who know 
enough calculus to digest the usual definition and wish to understand its 
equivalence with our reformulation. The former may just skip without loss 
some remarks obviously directed at the latter, who should exercise patience 
with details spelled out for the former.

We begin by recalling the usual formula for the Euler-Mascheroni constant and
then offer an alternative formula in the same spirit which nevertheless
eliminates the explicit appearance of natural logarithms. Our first attempt at
attaching a very simple, compelling narrative interpretation to our formula
for γ produced only a fallacy, albeit an instructive one. Rather than suppress
this initial failure, we start there, so that the reader will appreciate the mildly
technical but unavoidable modification required for a valid interpretation.

2 The standard definition of the Euler-Mascheroni 
constant 

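We begin by explaining in elementary terms the usual definition

\[ \gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln n \right). \]

We wish to interpret γ geometrically. For this purpose it does no harm to
make the modification

\[ \gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n-1} \frac{1}{k} - \ln n \right) \]

or (after reindexing)

\[ \gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln(n+1) \right). \]

Note that the general term of the original sequence and of the (first
formulation of the) new sequence differ by 1/n, and the difference approaches 0
as n grows, justifying the modification.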

The (reindexed) new sequence leads to the area of the shaded region in
the following diagram:

Here the curve in the diagram represents the graph of y = 1/x. Indeed
the first term of the new sequence gives the area of the leftmost black wedge,
the second term the area of the two leftmost wedges, and generally, the nth
term the area of the n leftmost wedges. Explicitly, the sum 1 + 1/2 + ... + 1/n
gives the area of the n leftmost rectangles and ln(n + 1) means the area under
the curve and within these rectangles. Taking the limit gives the area of all the
wedges.

Now imagine all the wedges sliding horizontally to the left 




until we have them stacked vertically within the square, our original leftmost
rectangle:

From this picture we see (or at least glean the tools needed to prove) the
finiteness of γ. As a subset of the unit square it must have an area between
0 and 1, and the picture even makes clear that the area of all the wedges must
exceed 0.5. Moreover the area of the first n wedges falls short of γ by no more
than 1/n (since the remaining wedges fit in a 1 by 1/n rectangle), but also by
at least 1/(2(n + 1)) (since they fill more than half of the 1 by 1/(n + 1)
rectangle that just contains them).
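
The two-sided estimate can be checked directly. In the sketch below the area
of the n leftmost wedges is computed as the harmonic sum minus ln(n + 1); we
assume the remaining wedges sit in a strip of height 1/(n + 1), which gives
the lower bound 1/(2(n + 1)), and we compare against the upper bound 1/n. The
constant GAMMA is a reference value we supply.

```python
from math import log

GAMMA = 0.5772156649015329  # reference value of the Euler-Mascheroni constant (our addition)


def wedge_area(n: int) -> float:
    """Area of the n leftmost wedges: 1 + 1/2 + ... + 1/n - ln(n + 1)."""
    return sum(1.0 / k for k in range(1, n + 1)) - log(n + 1)


for n in (1, 10, 100, 1000):
    shortfall = GAMMA - wedge_area(n)
    # Columns: n, lower bound 1/(2(n+1)), actual shortfall, upper bound 1/n.
    print(n, 1 / (2 * (n + 1)), shortfall, 1 / n)
```

Already for n = 1 the single wedge has area 1 - ln 2, about 0.307, so the
shortfall is about 0.270, which lies between 0.25 and 1.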

Observe that, as with π, we can interpret γ as the area of a region in
the plane that we can construct explicitly. Of course this region seems highly
artificial compared with the unit circle. To a student of integral calculus the
region should seem less unnatural. In that context, γ bounds the error that
occurs when approximating the areas defining natural logarithms of natural
numbers by means of upper sums. Of course one can approximate a given area
by many different upper sums, but these upper sums often arise in their own
right, as harmonic sums 1 + 1/2 + ... + 1/n. One often has occasion to turn the
story around, using (sophisticated but easily manageable) natural logarithms
to approximate (elementary but awkward) harmonic sums. As an a priori
estimate of the error involved, γ can help us improve such approximations,
and in this role it enters many formulas.

3 Getting rid of the logarithms 

The following pictures suggest some calculations to approximate γ which
don't involve logarithms, and thus lead to a way of framing γ for an audience
that doesn't know about logarithms (and doesn't want to hear about them):



By way of explanation, we would like to estimate the total area of all the 
wedges without computing exactly the area under the curve. We do this by 
now also approximating the region under the graph by a union of rectangles, 
but we let these approximations get more refined as we go. 

As far as concerns estimating γ, we now have two sources of error. First,
the nth picture only takes account of the first n wedges. Second, we have
unwanted area now below the various wedges.

We have already bounded the magnitude of the first type of error by 1/n.
We can also bound the second type of error by sliding wedges, this
time the new wedges we have created under the graph. In the nth picture,
these will all slide horizontally to fit inside a 1/n by 1 rectangle. So 1/n also
bounds the second type of error. These two types of error, moreover, carry
opposite signs, so certainly 1/n bounds the total error. [1]

Numerically, the areas of the regions in the three diagrams equal, as the
reader may easily check,

\[ 1 - \left( \tfrac13 + \tfrac14 \right) \approx 0.4167, \qquad 1 + \tfrac12 - \left( \tfrac14 + \cdots + \tfrac19 \right) \approx 0.5044, \qquad 1 + \tfrac12 + \tfrac13 - \left( \tfrac15 + \cdots + \tfrac1{16} \right) \approx 0.5359, \]

or in general

\[ \sum_{k=1}^{n-1} \frac{1}{k} - \sum_{k=n+1}^{n^2} \frac{1}{k} \]

for the (n − 1)th picture.

[1] In actuality the two types of error tend to cancel. It turns out that n² times
the error approaches 2/3. Our reformulation converges to γ much faster than the
original definition.

In words, to approximate γ, for q = n², we sum the reciprocals of numbers
less than √q and subtract off the reciprocals of all numbers greater than √q
up to q. Indeed we need not require making q a perfect square. We see this
easily by comparing the recipe applied to a general q with the recipe applied
to the largest square below it.

So, as a slogan, for large numbers q, γ approximates the sum of the
reciprocals of the numbers below the square root of q minus the sum of the
reciprocals of the numbers above the square root of q, up to q.
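
The slogan translates almost word for word into a short program. The sketch
below is our own illustration (the function name and the reference constant
GAMMA are not from the text); it evaluates the recipe for a few squares
q = n² and prints q times the error, which appears to settle near 2/3.

```python
from math import isqrt

GAMMA = 0.5772156649015329  # reference value, used only for comparison (our addition)


def log_free_estimate(q: int) -> float:
    """Reciprocals of the numbers below sqrt(q) minus those above sqrt(q), up to q."""
    root = isqrt(q)
    # k < sqrt(q) exactly when k * k < q; this skips the root of a perfect square.
    below = sum(1.0 / k for k in range(1, root + 1) if k * k < q)
    # k > sqrt(q) exactly when k >= isqrt(q) + 1.
    above = sum(1.0 / k for k in range(root + 1, q + 1))
    return below - above


for n in (2, 3, 4, 10, 100):
    q = n * n
    estimate = log_free_estimate(q)
    print(q, estimate, q * (GAMMA - estimate))
```

The same function accepts a q that is not a perfect square, for instance
`log_free_estimate(10_001)`, where the agreement with γ is coarser but still
within roughly 1/√q.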



4 A fallacy, first 

Roughly speaking, for random q, the probability that d divides q equals 1/d
(since dividing q by d can leave d possible remainders, all equally likely, with
0 just one among them). [2] In probability theory one typically introduces a
quantity that equals 1 when an event occurs and 0 when it doesn't. The
expectation of this sort of quantity (intuitively, its value on the average)
coincides with the probability of the event. The virtue of working with
expectations rather than directly with probabilities lies in the linearity of
expectation: the expectation of a sum equals the sum of the expectations.

[2] We must say "roughly speaking" because we cannot make literal sense of "for random
q" since the set of all natural numbers does not carry any uniform probability distribution.
We may of course speak of a random q between 1 and B, but depending upon the B, the
probability may not equal exactly 1/d; the larger the B, though, the smaller the error.

So suppose we have a set D = {d_1, ..., d_j}. Again, roughly speaking, the
expected number of elements of D that divide a random q should equal

\[ \frac{1}{d_1} + \cdots + \frac{1}{d_j}. \]

Notice that when D consists of many consecutive natural numbers, the 
expected number of elements of D that divide a random q has the form of 
the sort of quantities that come into our approximations for γ.
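
For a fixed set D and a bounded q the heuristic can be made concrete. The
sketch below counts, for q running from 1 to a bound B, how many elements of
D divide q, and compares the average with the sum of reciprocals; the
particular D and B are hypothetical choices of ours.

```python
def average_hits(D, B):
    """Average, over q = 1..B, of the number of elements of D that divide q."""
    hits = sum(1 for q in range(1, B + 1) for d in D if q % d == 0)
    return hits / B


D = range(10, 21)  # a hypothetical block of consecutive natural numbers
B = 10_000         # hypothetical bound on q
heuristic = sum(1.0 / d for d in D)

print(average_hits(D, B), heuristic)
```

Each d contributes an error below 1/B, so the two numbers differ by at most
the size of D divided by B.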

This perhaps suggests asking if γ approximates the expectation of Z,
defined as the number of divisors of q below √q minus the number of
divisors of q above √q. [3]

Z does indeed have an expectation, but its expectation turns out equal
to 0, not γ!

Indeed, if d divides q, so does q/d, and if one lies below √q the other lies
above, and vice versa. For example, if d < √q and also q/d < √q, we have

\[ q = d \cdot (q/d) < (\sqrt{q})^2 = q, \]

a contradiction, and similarly for d, q/d > √q. Thus every number q has
exactly the same number of divisors below √q as above.
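
The pairing argument is easy to confirm by brute force. The sketch below (the
function name is ours) counts the divisors of q on either side of √q using
exact integer comparisons.

```python
def divisor_balance(q: int) -> int:
    """Number of divisors of q below sqrt(q) minus the number above sqrt(q)."""
    below = sum(1 for d in range(1, q + 1) if q % d == 0 and d * d < q)
    above = sum(1 for d in range(1, q + 1) if q % d == 0 and d * d > q)
    return below - above


print([divisor_balance(q) for q in (12, 36, 97, 360)])  # prints [0, 0, 0, 0]
```

For a perfect square such as 36 the divisor 6 sits exactly at the square root
and is counted on neither side.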

Of course, the reader already trained to refuse even to hear all but the 
most rigorous analysis will find no fallacy here. However, in mathematics, 
our type of heuristic reasoning does often lead, after careful formulation, to 
true statements, albeit often these statements turn out much harder to prove 
than the heuristics suggest. So even though we made clear when we left the 
realm of rigorous reasoning, perhaps it still comes as a surprise that we have 
failed so badly, that the gaps we left do not admit any repair. 

The reader may well wish to think upon the question of what sort of
burden a failed heuristic imposes. We have proved that it led us to a wrong
conclusion. Generally speaking we don't feel we need to explain why
erroneous proofs lead to false conclusions! Nevertheless, when an erroneous proof
depends on the unproved assumption that certain quantities vary
independently when in fact they don't, we ought to enquire into the nature of their
interdependence. Alternatively, and we take this approach here, we can see
if we can rescue the heuristic by some slight change of the situation.

We surely can make perfect sense of "the expected number of elements of
D = {d_1, ..., d_j} that divide a random q equals 1/d_1 + ... + 1/d_j" provided
that we keep D fixed, bound q, and accept some small error that tends to vanish
as the size of the bound on q grows. But our purported interpretation of γ
had the "D" varying along with q.

[3] We do not have to end with "up to q" since no number larger than q divides q.

This suggests a first, but admittedly ugly, fix. First fix q. Now given
another quantity Q, consider Z_q, the number of divisors of Q less than
√q minus the number of divisors of Q between √q and q. The expectation of
Z_q takes the form of one of our approximations to γ, but we must let q grow and
take a bald limit to get γ itself, so this is not the stuff of a popular
interpretation.

5 A surprisingly satisfactory fix 

We shall now formulate a family of valid probabilistic interpretations of γ,
all very much in the spirit of the fallacious one, albeit just slightly more 
complicated. 

Theorem 1 Let F : N → R stand for any function such that

a) F monotonically weakly increases;

b) F tends to infinity; and

c) q/F(q) tends to infinity.

Let Z_F(q) equal the number of divisors of q less than √F(q) minus the
number of divisors of q in the half-open interval [√F(q), F(q)). Then, on the
average, [4] Z_F(q) equals γ.

Considering our original goal, a popular interpretation of γ, we could
perhaps just set F(x) = √x. We then get

γ means the average by which the count of divisors of a number
that sit below its fourth root exceeds the count of divisors that
lie between the fourth root and the square root.
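
The fourth root version can be tested numerically. The sketch below averages,
over q from 1 to a bound B, the count of divisors below the fourth root of q
minus the count of divisors from the fourth root up to (but excluding) the
square root. The bound values and the reference constant GAMMA are our
choices; the loop runs over divisors d and their multiples q rather than over
q and its divisors, purely for speed.

```python
GAMMA = 0.5772156649015329  # reference value of the Euler-Mascheroni constant (our addition)


def average_Z(B: int) -> float:
    """Average over q = 1..B of (#divisors < q^(1/4)) - (#divisors in [q^(1/4), q^(1/2)))."""
    total = 0
    d = 1
    while d * d < B:  # a d with d*d >= B is never below sqrt(q) for any q <= B
        d2, d4 = d * d, d ** 4
        for q in range(d, B + 1, d):  # q runs over the multiples of d
            if d4 < q:
                total += 1  # d < q^(1/4): counted positively
            elif d2 < q:
                total -= 1  # q^(1/4) <= d < q^(1/2): counted negatively
        d += 1
    return total / B


for B in (1_000, 10_000, 200_000):
    print(B, average_Z(B), GAMMA)
```

All comparisons use integer powers (d⁴ < q and d² < q), so no floating point
roots enter the count.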

The gist of the previous section consists in telling us that we cannot
entirely dispense with the condition that n/F(n) tends to infinity, since the
conclusion fails when taking F(n) = n.

Proof

In the following diagram,

circles in row r (counting up) have area 1/r. We have colored green those
circles in column q having row number less than √F(q), and those with row
number in the half-open interval [√F(q), F(q)) red. (While we have in mind
a general F, satisfying the conditions of Theorem 1, the diagram shows the
situation specifically for F(x) = √x.)

[4] Of course by "on the average" we mean taking the limit of averages that arise with q
bounded by B as B increases.

Consider a particular column. By our previous work, [5] the excess of the
green area over the red area takes the form of an approximation to γ, with the
approximations approaching perfection as we move to the right, on account
of the assumption that F(q) grows without bound. So certainly if we consider
together all the columns up to column B, the total green area less the total
red area divided by B approaches γ as B tends toward infinity.

Now compare the following diagram with the previous:



Here all circles now have area 1, but this time we only color circles if the 
row number divides the column number. 

For the second diagram, for a given column, the excess of green area over
red area constitutes just the sort of quantity we have claimed averages to γ
in the long run.

It suffices to show that if we consider together all the columns up to
column B, the total green area less the total red area divided by B approaches
γ as B tends toward infinity.

While the two diagrams appear quite different column-by-column, a
row-by-row comparison works out quite simply, as follows.

[5] Just for the sake of simplicity now, here we choose to approximate γ by the sum of
the reciprocals of the numbers below the square root of q minus the sum of the reciprocals
of the numbers equal to or above the square root of q.

Fix a row number, say r, and consider the corresponding r-rows in the
two diagrams, with the aim of estimating the discrepancy between, first, the
total red areas they hold, and second, their total green areas.

In the r-row of the second diagram, consider any colored circle if one
occurs. Call it C_1; C_1 has area 1. Write C_2 for the next colored circle to its
right (in the infinite version of the second diagram). Next, consider the circle
C_1' in the first diagram corresponding position-wise to C_1 together with the
r − 1 circles in the first diagram in positions corresponding to those circles
strictly between C_1 and C_2 (all these diagram 1 circles together have total
area 1).

The previous paragraph shows that if the total red areas in the r-rows of
the diagrams differ, they differ on account of what happens, moving
left to right say, as we enter and leave the first diagram's "red island".

Thus the red area discrepancy in row r cannot exceed magnitude 1, and 
likewise for the green area discrepancy. 

As for the green area minus the red area in the two r-rows, the discrepancy 
between the diagram one difference and the diagram two difference cannot 
exceed magnitude 2. 

For rows with no colored circles in either diagram we obviously have no
discrepancy at all, and at most F(B) rows have colored circles.

We have now bounded the total green area minus red area discrepancy
(for all rows) between the two diagrams by 2F(B). By assumption, 2F(B)/B
approaches 0 as B grows. Thus, as B increases, the values for the average
green area minus red area per column for the two types of diagrams converge.

Since this average approaches γ for diagrams of the first type, it also does
for diagrams of the second type, as desired.
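
The row-by-row bookkeeping of the proof can be replayed numerically for
F(x) = √x. In the sketch below, which is our own illustration rather than part
of the proof, row r is red in the columns q with r² < q ≤ r⁴ and green in the
columns with q > r⁴. The first diagram counts every such column with weight
1/r; the second counts only the multiples of r, with weight 1.

```python
from math import sqrt

GAMMA = 0.5772156649015329  # reference value of the Euler-Mascheroni constant (our addition)


def signed_areas(B: int):
    """Total green-minus-red area over columns 1..B in each diagram, for F(x) = sqrt(x)."""
    first = 0.0  # first diagram: every circle in row r has area 1/r
    second = 0   # second diagram: area 1, but only where r divides the column number
    r = 1
    while r * r < B:
        red_lo, red_hi = r * r, min(r ** 4, B)  # row r is red for red_lo < q <= red_hi
        green_lo, green_hi = red_hi, B          # and green for green_lo < q <= green_hi
        first += ((green_hi - green_lo) - (red_hi - red_lo)) / r
        # hi // r - lo // r counts the multiples of r in the range lo < q <= hi
        second += (green_hi // r - green_lo // r) - (red_hi // r - red_lo // r)
        r += 1
    return first, second


B = 10 ** 6
first, second = signed_areas(B)
print(first / B, second / B, abs(first - second), 2 * sqrt(B))
```

The last two printed numbers compare the actual discrepancy between the
diagrams with the bound 2F(B) from the proof.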

6 The case of F(x) = ax

Theorem 1 does not speak to the case of F(x) = ax for any a ∈ (0, 1);
such an F could produce as many as ax rows that exhibit a discrepancy.
Nevertheless we can make the proof technique yield up a complete analysis.

Theorem 2 Fix a ∈ (0, 1). Write A for the average of the number of divisors
of n that lie in (0, √(an)) minus the number that lie in (√(an), an). Then

\[ A = \sum_{k=1}^{\lceil 1/a \rceil - 1} \frac{1}{k} - \ln\left(\frac{1}{a}\right). \]

Before turning to the proof, we offer a few remarks.

First, except when 1/a has integral value, ⌈1/a⌉ − 1 = ⌊1/a⌋, which looks a
bit simpler.

The formula correctly predicts a balance between divisors above and
below the square root of n, the a = 1 case. Moreover, as a approaches 0,
the values of the formula converge to γ, just as one might hope based on
Theorem 1.
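
These remarks can be explored with a short program. The sketch below assumes
the formula of Theorem 2 in the form "harmonic sum up to ⌈1/a⌉ − 1, minus
ln(1/a)", and compares it with a brute-force average for the hypothetical
choice a = 3/10, where 1/a is not an integer and the endpoints of the
intervals play no role. All names are ours.

```python
from fractions import Fraction
from math import ceil, log

GAMMA = 0.5772156649015329  # reference value of the Euler-Mascheroni constant (our addition)


def formula_A(a: Fraction) -> float:
    """Harmonic sum up to ceil(1/a) - 1, minus ln(1/a)."""
    m = ceil(1 / a) - 1
    return sum(1.0 / k for k in range(1, m + 1)) - log(1 / a)


def brute_force_A(a: Fraction, N: int) -> float:
    """Average over n = 1..N of (#divisors in (0, sqrt(an))) - (#divisors in (sqrt(an), an))."""
    p, s = a.numerator, a.denominator
    total = 0
    for d in range(1, N + 1):
        sd, sd2 = s * d, s * d * d
        if sd >= p * N:  # d >= a*N: d lies in neither interval for any n <= N
            break
        for n in range(d, N + 1, d):  # n runs over the multiples of d
            pn = p * n
            if sd2 < pn:                 # d < sqrt(a*n)
                total += 1
            elif sd2 > pn and sd < pn:   # sqrt(a*n) < d < a*n
                total -= 1
    return total / N


a = Fraction(3, 10)
estimate = brute_force_A(a, 50_000)
print(estimate, formula_A(a))
print(formula_A(Fraction(1)), formula_A(Fraction(1, 100_000)), GAMMA)
```

Passing a as an exact fraction keeps the rounding in ⌈1/a⌉ and all the
interval tests free of floating point error.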

The discontinuities in the graph below

[Figure: graph of A as a function of a]

come as no surprise. As a shrinks past 1/k, we lose, from the second diagram,
divisors of n of the form n/k when they occur, which they do for one n out
of k. For those n large compared to k we will have these divisors colored red
(since n/k will exceed √(an)), so we expect the graph to jump up (as we
move to the left) by 1/k.

For all a < 1 we have A > 0, so we expect, on the average, more divisors
in (0, √(an)) than in (√(an), an). This leads us to guess that numbers n with
more divisors in (0, √(an)) than in (√(an), an) should occur with a positive
density. But this does not follow immediately. Logically speaking, relatively
rare numbers with many more divisors in (0, √(an)) than in (√(an), an) might
possibly make all the necessary contribution to the average behavior.
Nevertheless, such numbers cannot occur too rarely, since, overall, relatively few
numbers n possess even a total number of divisors large compared with ln n.

Because the graph oscillates about the value γ, for infinitely many special
values of a (namely those of the form e^{γ − (1 + 1/2 + ... + 1/m)}), A takes
the value γ, the right answer for the wrong reason, if you will. Note that this
characterizes γ: the only average realized for infinitely many values of a.

One might wonder about the average value of the average if we choose a
from a uniform distribution on (0, 1). Curiously, integrating A as a varies
over (0, 1) gives ζ(2) − 1 = π²/6 − 1 = 0.644934067....
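
The value of this integral can be approached numerically. The sketch below
assumes the formula of Theorem 2 in the form A = H_m − ln(1/a) with
m = ⌈1/a⌉ − 1, where H_m denotes the harmonic sum 1 + 1/2 + ... + 1/m. On each
interval 1/(m + 1) ≤ a < 1/m the harmonic part is constant, so the integral
splits into a series that we truncate; the function name and the truncation
point are our choices.

```python
from math import pi


def integral_of_A(pieces: int) -> float:
    """Integrate A over (0, 1) piece by piece, keeping the first `pieces` intervals."""
    total = 0.0
    harmonic = 0.0
    for m in range(1, pieces + 1):
        harmonic += 1.0 / m
        # On 1/(m+1) <= a < 1/m the harmonic part of A equals H_m.
        total += harmonic * (1.0 / m - 1.0 / (m + 1))
    # The logarithmic part integrates exactly: the integral of ln(1/a) over (0, 1) is 1.
    return total - 1.0


value = integral_of_A(10 ** 6)
print(value, pi ** 2 / 6 - 1)
```

The truncation drops only the harmonic contribution of a below 1/(pieces + 1),
which shrinks roughly like ln(pieces)/pieces.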

Proof of Theorem 2 We refer here to the same two sorts of diagrams as in
the last proof, but now we assume them square, just so that the average per
row excess of green area over red area equals the average per column excess.

We wish to compare, asymptotically, the average per row excess of
green area over red area in the two types of square diagrams.

Since we have a uniform bound on the excess that occurs in any single
row, we can safely ignore the green circles entirely! The green circles occur
in only √(an) rows, so the variation in green areas between the two diagrams
will tend to vanish when we divide by n and let n grow. (Compare with the
previous proof, where the condition on F meant that switching to a
row-by-row analysis ultimately allowed us to ignore everything. Even with F = ax,
the old reasoning still applies to √F.)

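As for the variation in the red area between the two diagrams, we employ
a straightforward integral approximation, getting

\[ \int_0^a \left[ \left( \left\lceil \frac{1}{y} \right\rceil - \left\lceil \frac{1}{a} \right\rceil \right) - \left( \frac{1}{y} - \frac{1}{a} \right) \right] dy. \]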

The first term, in parentheses, captures the contribution for the second dia- 
gram, and from this we subtract off the contribution from the first diagram. 
Specifically, we have estimated the average per row excess of the red area in 
the second diagram over red area in the first. 

Please note, for clarity, that since red dots count negatively, and by the
remark above concerning the possibility of ignoring the green area, this
expression also estimates the amount by which the average per row excess of
green area over red area in the first diagram exceeds the average per row
excess of green area over red area in the second diagram.

After some routine calculation, the integral in question evaluates to [6]

\[ \ln\left(\frac{1}{a}\right) - \sum_{k=1}^{\lceil 1/a \rceil - 1} \frac{1}{k} + \gamma. \]

Since we know that the first diagram has an average per row excess of green
area over red area equal to γ, while we seek the corresponding information
for the second diagram, the result follows when we subtract this quantity
from γ.



7 Final Remark 

Even though we set as our original goal the crafting of novel interpretations
for γ, a great variety of curious statements arise when we force γ to leave
the story. Here we give just one example. By Theorem 1, a number n tends
to have γ more divisors in (0, n^{1/4}) than in (n^{1/4}, n^{1/2}), and likewise
γ more divisors in (0, n^{1/8}) than in (n^{1/8}, n^{1/4}). Subtracting these two
differences, we see that:

on the average, n has exactly twice as many divisors in (n^{1/4}, n^{1/2})
as it does in (n^{1/8}, n^{1/4}).

Since γ no longer appears in the statement, one should naturally enquire
about the possibility of a γ-free proof.
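
A numerical experiment at least makes the γ-free statement plausible. The
sketch below, our own illustration, computes the average number of divisors
of n in each of the two open intervals for n up to a bound B, by counting for
each candidate divisor d the multiples n of d with d^p < n < d^P. The bound B
is a hypothetical choice.

```python
def average_count(B: int, p: int, P: int) -> float:
    """Average over n = 1..B of the number of divisors d of n with n^(1/P) < d < n^(1/p)."""
    total = 0
    d = 2  # d = 1 never qualifies, since 1 > n^(1/P) fails for every n >= 1
    while d ** p < B:
        lo = d ** p              # the condition d < n^(1/p) means n > d^p
        hi = min(d ** P - 1, B)  # the condition d > n^(1/P) means n < d^P
        total += hi // d - lo // d  # multiples of d in the range lo < n <= hi
        d += 1
    return total / B


B = 10 ** 10
outer = average_count(B, 2, 4)  # divisors between the fourth root and the square root
inner = average_count(B, 4, 8)  # divisors between the eighth root and the fourth root
print(outer, inner, outer - 2 * inner)
```

The difference `outer - 2 * inner` is the quantity that the statement
predicts should tend to 0 as B grows.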



[6] In the case of a = 1, mechanical evaluation of this integral constitutes the essence of
a proof of the theorem of de la Vallée Poussin mentioned in the introduction: γ emerges
directly in the form of its usual definition. But from the pairing of
divisors of n above and below √n we actually know the value of the integral in advance,
albeit just in this case. That means we actually have in hand two independent proofs
of de la Vallée Poussin's theorem.